An Incremental Learning Method for Data Mining from Large Databases
نویسندگان
چکیده
declare that this thesis contains no material which has been accepted for the award of any other degree or diploma in any tertiary institution. To my knowledge and belief, this thesis contains no material previously published or written by another person except where due reference is made in the text of the thesis. Abstract Knowledge Discovery techniques seek to find new information about a domain through a combination of existing domain knowledge and data examples from the domain. These techniques can either be manually performed by an expert, or automated using software algorithms (Machine Learning). However some domains, such as the clinical field of Lung Function testing, contain volumes of data too vast and detailed for manual analysis to be effective, and existing knowledge too complex for Machine Learning algorithms to be able to adequately discover relevant knowledge. In many cases this data is also unclassified, with no previous analysis having been performed. A better approach for these domains might be to involve a human expert, taking advantage of their expertise to guide the process, and to use Machine Learning techniques to assist the expert in discovering new and meaningful relationships in the data. It is hypothesised that Knowledge Acquisition methods would provide a strong basis for such a Knowledge Discovery method, particularly methods which can provide incremental verification and validation of knowledge as it is obtained. This study examines how the MCRDR (Multiple Classification Ripple-Down Rules) Knowledge Acquisition process can be adapted to develop a new Knowledge Discovery method, Exposed MCRDR, and tests this method in the domain of Lung Function. Preliminary results suggest that the EMCRDR method can be successfully applied to discover new knowledge in a complex domain, and reveal many potential areas of study and development for the MCRDR method. iii Acknowledgements
منابع مشابه
An Efficient Mining Method for Incremental Updation in Large Databases
The database used in mining for knowledge discovery is dynamic in nature. Data may be updated and new transactions may be added over time. As a result, the knowledge discovered from such databases is also dynamic. Incremental mining techniques have been developed to speed up the knowledge discovery process by avoiding re-learning of rules from the old data. To maintain the large itemsets agains...
متن کاملA general mining method for incremental updation in large databases
The database used for knowledge discove y is dynamic in nature. Data may be updated and new transactiom may be added ouer time. As a msult, the knowledge discovered from such databases is also dynamic. Incremental mining techniques have been developed to speed up the knowledge discovery PTOcess by avoiding re-learning of rules from the old data. To maintain the large itemsets against the update...
متن کاملQuantifying the Utility of the Past in Mining Large Databases
Incremental mining algorithms that can efficiently derive the current, mining output by ut,ilizing previous mining results are attractive to business organizations since data mining is typically a resource-intensive recurring activity. In this paper, we present the DELTA algorithm for the robust and efficient incremental mining of association rules on large market basket databases. DELTA guaran...
متن کاملارائه مدلی برای استخراج اطلاعات از مستندات متنی، مبتنی بر متنکاوی در حوزه یادگیری الکترونیکی
As computer networks become the backbones of science and economy, enormous quantities documents become available. So, for extracting useful information from textual data, text mining techniques have been used. Text Mining has become an important research area that discoveries unknown information, facts or new hypotheses by automatically extracting information from different written documents. T...
متن کاملAn Adaptive Algorithm for Incremental Mining of Association Rules
The association rules represent an important class of knowledge that can be discovered from data warehouses. Current research efforts are focused on inventing efficient ways of discovering these rules from large databases. As databases grow, the discovered rules need to be verified and new rules need to be added to the knowledge base. Since mining afresh every time the database grows is ineffic...
متن کاملQuantifying the Utility of the Past in Mining Large
| Incremental mining algorithms that can eeciently derive the current mining output by utilizing previous mining results are attractive to business organizations since data mining is typically a resource-intensive recurring activity. In this paper, we present the DELTA algorithm for the robust and eecient incremental mining of association rules on large market basket databases. DELTA guarantees...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2006